Reclassifying subcategorization frames for experimental analysis and stimulus generation
Abstract
Researchers in psycholinguistics and neurolinguistics increasingly test their experimental hypotheses against probabilistic models of language. VALEX (Korhonen, Krymolowski & Briscoe, 2006) is a large-scale verb lexicon that specifies verb usage as probability distributions over a set of 163 verb SUBCATEGORIZATION FRAMES (SCFs). VALEX has proved to be a popular computational linguistic resource and may also be used by psycho- and neurolinguists for experimental analysis and stimulus generation. However, a probabilistic model based on a set of 163 SCFs often proves too fine-grained for experimenters in these fields. Our goal is to simplify the classification by grouping the frames into genera: explainable clusters that may be used as experimental parameters. We adopted two methods for re-classification. One was a manual, linguistic approach derived from verb argumentation and clause features; the other was an automatic, computational approach derived from the graphical representation of SCFs for use in Natural Language Processing technology. The aim was not only to compare the results of two quite different methods for our own interest, but also to enable other researchers to choose whichever re-classification better suits their purpose (one being grounded purely in theoretical linguistics and the other in practical language engineering). The various classifications are available as a free online resource to researchers.
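As a rough illustration of what grouping frames into genera means for a probabilistic lexicon, the sketch below collapses one verb's distribution over SCFs into a distribution over coarser groups by summing probability mass. It is a minimal sketch under stated assumptions, not the authors' procedure: the function name `collapse_to_genera`, the frame labels, the genus labels, and the probabilities are all hypothetical, and the code does not read the VALEX file format.

```python
from collections import defaultdict


def collapse_to_genera(scf_probs, scf_to_genus):
    """Sum a verb's SCF probabilities within the genus each frame maps to.

    scf_probs:    dict mapping SCF label -> probability for one verb
    scf_to_genus: dict mapping SCF label -> genus label (hypothetical mapping)
    """
    genus_probs = defaultdict(float)
    for scf, prob in scf_probs.items():
        genus_probs[scf_to_genus[scf]] += prob
    return dict(genus_probs)


# Hypothetical toy data: three frames grouped into two genera.
scf_probs = {"SCF_A": 0.5, "SCF_B": 0.25, "SCF_C": 0.25}
scf_to_genus = {"SCF_A": "genus_1", "SCF_B": "genus_1", "SCF_C": "genus_2"}

genus_probs = collapse_to_genera(scf_probs, scf_to_genus)
print(genus_probs)
```

Because each frame belongs to exactly one genus in this sketch, the collapsed values still sum to one, so a genus can serve as a single experimental parameter in place of several fine-grained frames.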
Similar resources
A Corpus-based Conceptual Clustering Method for Verb Frames and Ontology Acquisition
We describe in this paper the ML system, ASIUM, which learns subcategorization frames of verbs and ontologies from syntactic parsing of technical texts in natural language. The restrictions of selection in the subcategorization frames are filled by the concepts of the ontology. Applications requiring subcategorization frames and ontologies are crucial and numerous. The most direct applications ...
A procedure to automatically enrich verbal lexica with subcategorization frames
In this paper we introduce a method for automatically assigning subcategorization frames to previously unseen verbs of Spanish, as an aid to syntactical analysis. Since there is not a consensus on the classes of subcategorization frames, we combine supervised and unsupervised learning. We apply clustering techniques to obtain coarse-grained subcategorization classes from an annotated corpus of ...
A Subcategorization Frames Acquisition System for French Verbs
This paper presents a system intended to automatically acquire subcategorization frames (SCFs) of verbs from the analysis of large corpora. The system has been applied to a newspaper corpus (made of 10 years of the French newspaper Le Monde) and acquired subcategorization information for 3267 verbs. 286 SCFs were dynamically learnt for these verbs. From the analysis of 25 representative verbs, ...
Bengali Verb Subcategorization Frame Acquisition - A Baseline Model
Acquisition of verb subcategorization frames is important as verbs generally take different types of relevant arguments associated with each phrase in a sentence in comparison to other parts of speech categories. This paper presents the acquisition of different subcategorization frames for a Bengali verb Kara (do). It generates compound verbs in Bengali when combined with various noun phrases. ...
Publication date: 2012